System Combination Using Discriminative Cross-Adaptation

Authors

  • Jacob Devlin
  • Antti-Veikko I. Rosti
  • Sankaranarayanan Ananthakrishnan
  • Spyridon Matsoukas
Abstract

Cross-adaptation (CA) based methods of machine translation (MT) system combination work by adapting the decoding step of a baseline system using information from alternate systems. Generally, the required information is very deep, such as a full decoding forest. In this paper, we describe a method of cross-adaptation based system combination which only requires the final output from each alternate system. This is achieved by adding a discriminatively weighted n-gram confidence feature to our decoder. In order to optimize the confidence weight of each system, we present a novel procedure called non-linear Expected-BLEU optimization that can be used to optimize arbitrary nonlinear parameters for any decoding feature. We also describe a method for explicitly creating an adapted system that is dissimilar from each particular input system, which we have found to be useful in combination. Although our new method does not outperform a state-of-the-art confusion network (CN) based combination system on its own, we obtain statistically significant gains of 0.21-0.45 BLEU when the CA output is used as an additional system in CN combination.

∗ This work was supported by DARPA/I2O Contract No. HR0011-06-C-0022 under the GALE program (Approved for Public Release, Distribution Unlimited). The views, opinions, and/or findings contained in this article/presentation are those of the author/presenter and should not be interpreted as representing the official views or policies, either expressed or implied, of the Defense Advanced Research Projects Agency or the Department of Defense.

Similar resources

Advances in Mandarin broadcast speech recognition

We describe our continuing efforts to improve the UW-SRI-ICSI Mandarin broadcast speech recognizer. This includes increasing acoustic and text training data, adding discriminative features, incorporating frame-level discriminative training criterion, multiple-pass acoustic model (AM) cross adaptation, language model (LM) genre adaptation and system combination. The net effect without LM adaptati...

Discriminative adaptation based on fast combination of DMAP and dfMLLR

This paper investigates the combination of discriminative adaptation techniques. The discriminative Maximum A-Posteriori (DMAP) adaptation and discriminative feature Maximum Likelihood Linear Regression (DfMLLR) are examined. Since each of the methods is proposed for a distinct amount of adaptation data, it is useful to combine them in order to preserve the system's performance in situations with v...

Semi-Supervised Representation Learning for Cross-Lingual Text Classification

Cross-lingual adaptation aims to learn a prediction model in a label-scarce target language by exploiting labeled data from a label-rich source language. An effective cross-lingual adaptation system can substantially reduce the manual annotation effort required in many natural language processing tasks. In this paper, we propose a new cross-lingual adaptation approach for document classification ...

Regularized feature-space discriminative adaptation for robust ASR

Model-space adaptation techniques such as MLLR and MAP are often used for porting old acoustic models into new domains. Discriminative schemes for model adaptation based on MMI and MPE objective functions are also utilized. For feature-space adaptations, one extension to the well-known feature-space discriminative training (fMPE) algorithm, feature-space discriminative adaptation, was recently p...

Optimization on Vietnamese large vocabulary speech recognition

This paper summarizes our latest efforts toward a large vocabulary speech recognition system for Vietnamese. We describe the Vietnamese text and speech database which we collected as part of our GlobalPhone corpus. Based on these data we improve our initial Vietnamese recognition system [1] by applying various state-of-the-art techniques such as semi-tied covariance and discriminative training....

Publication date: 2011